Internet Info 1997 December

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9)
    id MAA00212 for urn-ietf-out; Thu, 7 Nov 1996 12:18:10 -0500
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1])
    by services.bunyip.com (8.6.10/8.6.9) with SMTP id MAA00207
    for <urn-ietf@services.bunyip.com>; Thu, 7 Nov 1996 12:18:08 -0500
Received: from josef.ifi.unizh.ch by mocha.bunyip.com with SMTP
    (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA23013
    (mail destined for urn-ietf@services.bunyip.com);
    Thu, 7 Nov 96 12:17:24 -0500
Received: from ifi.unizh.ch by josef.ifi.unizh.ch
    id <00910-0@josef.ifi.unizh.ch>; Thu, 7 Nov 1996 18:15:35 +0100
Subject: Re: [URN] I18N does not belong in URNs
To: moore@cs.utk.edu
Date: Thu, 7 Nov 1996 18:15:34 +0100 (MET)
Cc: tallen@fsc.fujitsu.com, urn-ietf@bunyip.com
In-Reply-To: <199611062032.PAA10773@ig.cs.utk.edu>
    from "Keith Moore" at Nov 6, 96 03:32:28 pm
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 5192
From: Martin J Duerst <mduerst@ifi.unizh.ch>
Message-Id: <"josef.ifi..235:07.10.96.17.15.36"@ifi.unizh.ch>
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: Martin J Duerst <mduerst@ifi.unizh.ch>
Errors-To: owner-urn-ietf@bunyip.com

Keith Moore wrote:

>Perhaps. I don't have as big a problem with this, as long as
>there's a discipline about how they are assigned that prevents
>URNs from being used as search strings. That's my main worry.
>(URNs are also supposed to be transcribable, but I recognize that
>only a very small set of characters is transcribable by everyone
>on the planet.)

For "everyone", you might have to limit it to X and O, or even
X and XX :-).
Just an additional example: Think about number plates of cars.
Lots of non-ASCII characters there all over Asia.

>> So for grandfathering, we have two choices.
>>
>> 1) We interpret it as "direct grandfathering of characters",
>>    in which case we have to allow a really wide range
>>    of characters (i.e. ISO 10646).
>> 2) We interpret it as "indirect grandfathering", in which
>>    case the digits, or whatever small set, will be
>>    enough.
>
>I suspect there will be some of each.

I wouldn't mind "some of each", if this means "about the same amount
for everybody in the world". If it comes to mean "a lot for
English-speaking, but very little for the rest of the world",
I would have to strongly disagree.

In theory, I could imagine making a worldwide search of suitable
namespaces, and limiting the characters to those appearing in such
namespaces. But I guess we would quickly end up with all the major
alphabets completed, and the work of deciding which of the ~20,000
CJK ideographs is used in a naming scheme, and which not, would
not be very rewarding.

>But we want the transformations
>from "old identifier" to URN to be simple and easy to remember.

Back to user-friendliness, in some sense. I have no problems
with this. But again, I wouldn't like it if "simple and easy to
remember" means that English-speaking people just have to type
them in, whereas people in the rest of the world, for namespaces
that are currently in use there, need a list of correspondences
they have to look up or remember constantly.

>> Choosing ASCII only as for URLs would be very unfair cheating.
>
>If the names aren't human meaningful, I don't see what you're
>complaining about.

If I were sure the names would not be meaningful, I would not
have much reason to complain. But the URL example shows that
keeping people from making names meaningful is very hard, if not
impossible.

You may agree with the above point, or you may disagree.
But either way, ASCII only is unfair.

If you think that naming schemes, protocols, standards, some
review board, or whatever, can assure that no meaningful stuff
is created, then there is no reason to restrict to ASCII.
One should assume that Japanese, Russians, Chinese, or whoever,
can be as disciplined, or as tightly controlled, as the
English-speaking part of the world.

On the other hand, if you think that namespaces will get meaningful
because of people's nature, and you think it is a bad thing that
has to be avoided, there is no other choice but to limit ourselves
to the decimal digits and maybe two or three other characters.

>> >> I definitely hate to see i18n just being moved around and delayed
>> >> by some people. With UTF-8 in URNs, we have made great progress.
>> >
>> >No, this is a big step backward. Just because there is a new layer
>> >being defined doesn't mean that I18N belongs there.
>> >
>> >I18N *is* important -- and I'd be happy to see a draft document or
>> >draft charter for a working group to define a protocol (say an extension
>> >of http/html) to resolve human-friendly names into URNs or URLs.
>>
>> Why is there a need for an additional protocol layer?
>> There is no need currently for English, is it?
>
>Yes, there is. Human-friendly names are inherently ambiguous,
>and have ambiguous meanings which require a human being to untangle.

[examples removed]

>URNs should be precise enough that a human isn't required to
>disambiguate the result.

I know your examples. But there can be a great deal of
human-friendliness, without any need for ambiguity.
URLs such as http://www.ibm.com or http://www.icrc.org
are examples. My brother, who works at ICRC in Geneva,
recently called me. Before he was able to tell me how I could
reach him by email, I had the ICRC home page on my browser,
and had found the generic email address explanations, and
had sent him a mail. Of course, I'm an expert, but there are
many people with less Web experience and technical expertise
who could do that.

If it can be that human-friendly, and that unambiguous,
I don't know why I should go to a search service to find
the ICRC. And I don't know why people that use other
scripts should go through a search service, while
English-speaking users don't have to do so.

>> And I don't want to have to search through long lists
>> of possible answers just because I use Japanese, whereas
>> I will get one immediate answer for English.
>
>The point is that you will often get multiple answers for English.
>English names are no more precise than names in other languages.

No, but English can be used in URLs, and I get an unambiguous
result if there is one, in due time. In Japanese, even if the
result would be unambiguous, I currently can't get it fast and
easily.

Regards, Martin.